Abstract

Introduction. This study sought to answer three questions: 1) Would the level of
domain knowledge significantly affect the user's search behaviour? 2) Would the level
of domain knowledge significantly affect search effectiveness? 3) What would be
the relationship between search behaviour and search effectiveness?

Method. Participants were asked to rate their familiarity with 200 thesaurus terms to 
measure their level of domain knowledge. They also searched on three assigned topics 
using the COMPENDEX database. Data were collected through pre- and post-search 
questionnaires, thesaurus term rating form, computer logs, and search session 
printouts. 

Analysis. Twenty-two engineering and science students' data were analysed both 
quantitatively and qualitatively. Quantitative analysis included both descriptive 
statistics and statistical testing, while the qualitative analysis was on the use of terms 
in queries. 

Results. As the level of domain knowledge increases, the user tends to do more 
searches and to use more terms in queries. However, the search effectiveness remained 
the same for all participants. 

Conclusion. The level of domain knowledge seems to have an effect on search 
behaviour, but not on search effectiveness, and search behaviour does not seem to be 
related to search effectiveness. The findings are limited by the small sample size and 
need to be confirmed in further studies. 


Introduction 

The enormous amount of digital information accessible today poses a great challenge to
information retrieval systems to retrieve effectively the information the user needs. To
design better, more effective retrieval systems, we need to understand users: what factors
affect their search behaviour, search strategy, and the effectiveness of their searches? User
characteristics as a contextual factor need to be investigated.

Among many user characteristics, the user's subject domain knowledge is considered an 
important factor that affects the user's information seeking behaviour and search 
performance (Allen 1991b). Subject domain knowledge is the 'knowledge that users have
of the topic being searched, or of the general subject area from which that topic is drawn'
(Allen 1991b: 11). It can be considered as the contextual or the background information a
user has about the topic. Information retrieval system designs should take this contextual 
factor into consideration when personalizing the system for a user. 

In this study, we investigate the effects of domain knowledge on users' search behaviour 
and search effectiveness. Our goal was to examine if users who were more knowledgeable 
in a field or about the topic to be searched would perform better in searching than the users 
who had less knowledge in the field. Instead of comparing the effects from different fields 
or domains or different contexts, we were interested in exploring how the amount, or the 
level of, domain knowledge in a particular field would impact the user's search. The 
context we chose is heat and thermodynamics in engineering and physics. We sought to 
answer three research questions: 

1. Would the level of domain knowledge in this field have a significant effect on the user's search 
behaviour during the search process? Since domain knowledge is related to terminologies and
vocabulary in a field (Allen 1991a), it is reasonable to assume that there would be a connection
between the level of domain knowledge and the terms the user would select for use in the query 
formulation while performing the search. 

2. Would the level of domain knowledge in this field have a significant impact on the effectiveness 
of searches or search performance? We assumed that the more knowledge a person has, the 
more familiar the user will be with the search question, and this familiarity would lead to two 
positive things during the search process: 1) the user would be more capable of formulating an 
effective query, and 2) the user would be more capable of identifying relevant documents. 
However, this remains a question, as Bhavnani (2002) asked: 'While domain-general knowledge
may be important, is it sufficient for effective search?' We wanted to find out if this would be the 
case. 

3. What would be the relationship between the search behaviour and search effectiveness? Would 
the search behaviour exhibited by the users with a higher level of domain knowledge lead to a 
more effective search? We assumed that users with different levels of domain knowledge related 
to a topic would demonstrate different kinds of search behaviour, and this would result in 
different search effectiveness. 

Of the three research questions, we were particularly concerned with the search 
effectiveness issue, because that is the ultimate goal for system designs. 

This paper reports the results from our study. We first review the related literature. We then 
discuss the methods we used in the study. The results are presented next, and finally we 
discuss the implications of the findings. 

Related research 

The effect of domain knowledge on database searching has been studied from various 
aspects. 

Borgman (1989) examined individual differences in information retrieval in terms of
personal characteristics, technical aptitudes, and academic orientation and concluded that
these factors were interrelated. Yee (1993) compared search strategies of graduate students
in library science and education when searching in both their own domain and the other
domain. For the students in education there were no differences in search behaviour when
conducting searches on familiar versus unfamiliar topics. The library science students took
more time to prepare off-line for the search on education administration, and they spent more
time evaluating the results in this unfamiliar field than they did when searching
in their own field. The library science students also used more thesaurus
terms and more synonyms in the search on education. Marchionini et al. (1993) compared
domain experts with intermediary search experts. Their work revealed that domain experts
were content-driven, focusing on the answers to the search questions, and had clear
expectations for the answer to be found, while search experts were problem-driven,
focusing on the problem statement and the query formulation. Kiestra et al. (1994)
conducted an experiment to identify the impact of system and domain knowledge on
search behaviour in an online catalog; twenty-nine students performed equal
numbers of searches in familiar and unfamiliar domains. The results indicated that domain
knowledge had a significant effect in only one of three analyses concerning search time.

The above-mentioned studies have one thing in common: that is, they either compared the 
effects of domain knowledge between different fields or with search knowledge. The 
effects of the level of knowledge within the field were not investigated. 

Several studies have been conducted to investigate the effects of the level of domain 
knowledge on searches. For example, Wildemuth (2004) investigated the effects of domain
knowledge on the formulation of the user's search tactics. She examined the tactics of a 
total of seventy-seven medical students searching a factual database in microbiology over a 
nine-month period and found that the search tactics changed over time as the students' 
domain knowledge changed. Vakkari et al. (2003) conducted a longitudinal study
investigating how twenty-two psychology students' growing understanding of the topic and 
search experience were related to their search tactics and terms while preparing a research 
proposal for a small study. Based on the results, the authors concluded that domain 
knowledge has an impact on searching assuming that users have a sufficient knowledge 
about the system used. Allen (1991a) found that there was a relationship between the level 
of domain (topic) knowledge and recall in searches in an online library catalogue. The 
study revealed that high-knowledge users in the subject area of Voyager 2 exploration of 
Neptune had a greater familiarity with the vocabulary of the topic. These studies, however, 
mainly concentrated on the search behaviour part, that is, the terms and patterns used. They 
did not investigate the search effectiveness issue. The question of search effectiveness remains:
would such search behaviour lead to an effective search? 

Previous studies on users in the engineering research context provide little detailed 
information about the user's search behaviour that is specifically related to information 
retrieval systems during the search process. Fidel and Efthimiadis ( 1999 ) investigated the 
information seeking and searching behaviour of engineers at Boeing when they search for 
information on the Web. This study interviewed nine engineers and observed their 
searching during their daily work period. The results revealed some common search 
patterns. For instance, all participants use relevance and reliability as the most important 
factors when collecting task-related information from the Web; they narrow a search more 
frequently than they broaden it; they all choose ease of use as the most important criterion 
in selecting a method for searching information. Cheuk & Dervin (1999) studied 
information seeking and use by three groups: auditors, engineers and architects. Ten 
information seeking situation types, including task initiating situation, focus forming 
situation, idea assuming situation, idea confirming situation, etc., are identified. Compared 
to other groups, the results showed that engineers reported more frequently in idea 
rejecting situations, which are defined as the situations in which they were unable to 
understand conflicting and unexpected information. Most engineers believed that they 
spent most of their time investigating the causes of product failures, and exploring why
testing results were unexpected, which seem to be related to the intention or purpose of the
search. Ellis & Haugan (1997) studied the information seeking patterns of engineers and
research scientists at Statoil's Research Centre, in Trondheim, Norway. The study 
identified similar behavioural characteristics of scientists and engineers. These 
characteristics are: surveying, chaining, monitoring, browsing, distinguishing, filtering, 
extracting, and ending. The results of these studies are too general to answer the research 
questions we sought to answer in our study. 

We felt additional research is needed to determine the roles of domain knowledge in 
searching. We are particularly interested in learning how engineering and science students, 
including both undergraduate and postgraduate students, do searches and what are the 
relationships between their level of domain knowledge, their behaviour when searching, 
and the effectiveness of their searches. 

Research methods 

In order to answer the three research questions, we needed to measure the involved 
variables and to compare the effects of the variables from different users. An experimental 
approach is thus the appropriate choice. 

Variables and measures 

Three variables are involved in the study: the level of subject domain knowledge, user's 
search behaviour, and the effectiveness of the user's search. Subject domain knowledge 
serves as the independent variable, and the other two serve as the dependent variables. 

Subject domain knowledge elicitation and representation 

The subject domain, or information seeking context, chosen for this study is heat and 
thermodynamics in engineering and physics. This field was chosen because it is the 
common fundamental knowledge area of engineering and physics. The search topics, as
will be described later in the Instruments subsection, are the applications of the theories in 
this field in the automobile industry. 

In this study, we used a thesaurus to elicit and represent the users' level of domain 
knowledge because thesauri are widely used as the tool for indexing and representing 
information in information retrieval systems and as the search-aid for users (Paice 1991;
Kristensen 1993). In particular, we used the Engineering Information Thesaurus, 2nd
edition (1995) because it is used by the database COMPENDEX, which was the database 
used in this study. We used the Heat and Thermodynamics class in the thesaurus. We put 
each of about 200 terms in the Heat and Thermodynamics class on a five-point scale of 
familiarity and asked our participants to rate each term. The level of subject domain 
knowledge was measured as the participants' self-reported ratings of familiarity with these 
terms. 
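
The rating-based measure can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure actually used in the study: it assumes that a participant's score is the arithmetic mean of his or her ratings on the 0 to 4 scale reported in the Results, and the function name and the sample terms are hypothetical.

```python
from statistics import mean


def domain_knowledge_score(ratings):
    """Mean familiarity rating over all rated thesaurus terms.

    `ratings` maps a term to a rating on the 0-4 scale (assumed scale).
    """
    for term, rating in ratings.items():
        if not 0 <= rating <= 4:
            raise ValueError(f"rating for {term!r} is outside the 0-4 scale")
    return mean(ratings.values())


# Hypothetical ratings from one participant (illustrative terms only).
ratings = {
    "Heat transfer": 4,
    "Entropy": 3,
    "Enthalpy": 2,
    "Thermal conductivity": 3,
}
score = domain_knowledge_score(ratings)
print(score)
```

A higher score is read as greater familiarity with the terminology of the field, which the study treats as a proxy for domain knowledge.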

To make sure that what we measured is the participants' familiarity with the field, 
represented by the terminology, rather than the structure of the controlled vocabulary, we 
did not include any term relationship information in the rating instrument, and the terms 
were ordered alphabetically. Therefore, the participants could rate the terms without any 
knowledge of the controlled vocabulary. 

It should be pointed out that there are many other ways to measure a person's level of
domain knowledge, for example, by using a test that is standard in the field or by having
domain experts evaluate the person through interviews. For practical reasons, we
chose to use ratings on thesaurus terms.

Search behaviour 

This is the information searching behaviour as defined by Wilson ( 2000 ): the micro-level of 
behaviour when a user interacts with a specific information retrieval system to search for 
relevant information. In this study, this is measured by the number of searches (queries), 
the number of words in a query, and the number of thesaurus terms used in query 
formulation. 
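
As an illustration, the three behaviour measures could be derived from a participant's query log roughly as follows. The sketch rests on assumptions that the paper does not state: queries are plain strings, words are separated by whitespace, and a thesaurus term counts as used when it appears verbatim (case-insensitively) in the query. The function name and sample data are hypothetical.

```python
def search_behaviour(queries, thesaurus_terms):
    """Return the three behaviour measures for one participant's query log."""
    vocab = [term.lower() for term in thesaurus_terms]
    n = len(queries)
    if n == 0:
        return {"n_queries": 0, "mean_words": 0.0, "mean_thesaurus_terms": 0.0}
    # Words per query: simple whitespace tokenisation (an assumption).
    total_words = sum(len(q.split()) for q in queries)
    # Thesaurus terms per query: crude substring match (an assumption).
    total_hits = sum(
        sum(1 for term in vocab if term in q.lower()) for q in queries
    )
    return {
        "n_queries": n,
        "mean_words": total_words / n,
        "mean_thesaurus_terms": total_hits / n,
    }


# Hypothetical log and term list, for illustration only.
log = ["automotive engine coolants", "fuel cells transportation"]
thesaurus = ["Coolants", "Fuel cells", "Entropy"]
measures = search_behaviour(log, thesaurus)
print(measures)
```

Substring matching is the simplest possible rule; a careful analysis would also need to handle plurals, truncation and multi-word terms.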


Search effectiveness or performance 

We use the single value Mean Average Precision (MAP) score (Baeza-Yates & Ribeiro-Neto
1999: 80) for each participant, as well as the total number of relevant documents
identified by each participant, as the measures of search effectiveness.

MAP is the average of precision figures obtained after each new relevant document is 
observed in the system's ranked result list and has been used as a major measure in the Text 
REtrieval Conferences (TREC) to measure the performance of different systems. This 
measure takes into account the number of the relevant documents retrieved and the ranking 
of these relevant documents in the result list. For a single topic it is the mean of the 
precision obtained after each relevant document is retrieved. For multiple topics, it is the 
mean of the average precision scores of each of the topics in the experiment. We chose to 
use this measure because, for an effective search, it is important that the participant should 
not only find relevant documents but also be able to use a query that can have the relevant 
documents ranked high in the result list. 
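
The definition above can be made concrete with a short Python sketch. It follows the wording of this section: average precision is the mean of the precision values observed at each rank where a relevant document appears, and MAP is the mean of those per-topic values. It is an assumption that the denominator is the number of relevant documents in the ranked list; TREC-style evaluation usually divides by the total number of known relevant documents for the topic. The function names and the sample rankings are hypothetical.

```python
def average_precision(relevance):
    """Average precision for one ranked list.

    `relevance` is a list of booleans, one per ranked document,
    True where the document was judged relevant.
    """
    hits = 0
    precisions = []
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precisions.append(hits / rank)  # precision at this rank
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(topics):
    """Mean of the average precision scores over several topics."""
    if not topics:
        return 0.0
    return sum(average_precision(t) for t in topics) / len(topics)


# Two hypothetical ranked result lists (illustrative judgments only).
topic_a = [True, False, True]   # precisions 1/1 and 2/3
topic_b = [False, True]         # precision 1/2
map_score = mean_average_precision([topic_a, topic_b])
print(round(map_score, 3))
```

Because precision is sampled only at the ranks of relevant documents, a query that pushes relevant documents towards the top of the list scores higher than one that retrieves the same documents further down.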

System 

COMPENDEX, one of the most frequently used engineering databases, is used as our
search system. COMPENDEX is available through Axiom's Web-based database
service (now part of ProComm), which allows users to perform simple and expert searches.

Participants 

Our initial experimental design intended to have twenty-eight participants in two groups: 
fourteen undergraduate students and fourteen postgraduate students, representing less 
knowledgeable users and knowledgeable users, respectively. The total number of
twenty-eight was targeted partly because of the limited funds available for the research (a larger
number could not be afforded) and partly for a practical reason: the engineering class from 
which we recruited the undergraduate students had only fourteen students. For balance, 
fourteen postgraduate students were to be recruited from engineering and sciences schools. 

However, two of the undergraduate students could not participate for various reasons. For
the postgraduate students, because of the time constraint (during a semester), we were able to
recruit twelve. Of the twelve, three participants' search data had errors (they searched
different databases and a different time period) and the data could not be used in the analysis.

We finally had twenty-two students who participated in the study. Thirteen were 
undergraduates and nine were postgraduates. It is assumed that these students have 
different levels of domain knowledge relating to the subject field of this study. Each 
participant was paid $20 after he or she completed the whole search process. 


http://www.informationr.net/ir/10-2/paper217.htmlfl 1/12/2015 5:01:21 PM] 






Domain knowledge, search behaviour, and search effectiveness of engineering and science students: an exploratory study 


Data collection 

Data were collected through a user questionnaire, a thesaurus term rating form, a
post-search questionnaire, computer logs, and the printout of records of search sessions.

The user questionnaire (Appendix 1) was used to obtain demographic information from the
subjects, such as sex, degree level, and search experience. The thesaurus term rating form
was used to elicit participants' knowledge of the subject domain. About 200 terms from the
Heat and Thermodynamics section of the Ei Thesaurus were associated with five-point
scales, from 'know nothing about the term' to 'very familiar with the term itself and its
relationships with other terms'. Only terms intended for use in indexing were included, that is,
terms with the USE reference were excluded. The relationships among the terms were
ignored in the rating form because we concentrated on measuring the level of domain
knowledge, rather than the user's familiarity with the thesaurus itself.

The post-search questionnaire (Appendix 2) was used to elicit information about the
search process, such as how the relevance judgments were made, how the search queries
were refined, and whether the participant was satisfied with the results.

The computer logs were used to save the participants' search history and search results.
To keep a hardcopy record, a printout of the search history for each subject
was also generated, and this became a valuable source of our data.

Search tasks 

Three search questions were given to the participant. These tasks were generated by the 
instructor of the engineering class from which the undergraduate participants were drawn, 
and were used for class projects. They were purposely designed to elicit such information as
subject domain, information searching, and search results evaluation. The search questions
are included in Appendix 1.

Procedures 

The experiment was conducted in a computer laboratory on campus. Before the participant 
came to the laboratory to perform searches, the participant was asked to complete the 
thesaurus term rating form. At the beginning of the experiment, the participant was asked 
to fill out the user questionnaire and then to perform searches on the three search 
questions. All participants used the same search questions. After each search, s/he 
completed the post-search evaluation questionnaire. This step continued until all the three 
questions were finished. The whole process took about two hours. 

Results 

Ratings of thesaurus terms 

The participants' mean familiarity ratings on the selected thesaurus terms ranged from
0.93 to 3.4 on a five-point scale from 0 to 4. Based on the ratings, the
participants were divided into two groups: a low-rating group, which is considered as the
low-level domain knowledge group, and a high-rating group, as the high-level domain
knowledge group. It is reasonably assumed that those who had high ratings are more
familiar with the terms and, therefore, are more knowledgeable in this field. The
benchmark data for the ratings are presented in Table 1. A significant difference is found
between the two groups by the paired t-test, with p < 0.000.


Subjects              Range of term rating   Mean term rating   No. of undergraduates   No. of postgraduates
Low Group (n = 11)    0.93-2.08              1.64               8                       3
High Group (n = 11)   2.13-3.4               2.75               5                       6

Table 1: Summary of thesaurus term ratings 

The average ratings are 1.64 and 2.75 for the low-level group and the high-level group
respectively. Although the ratings are not high for either group, a two-tailed, paired t-test
finds that there is a significant difference between the two groups, with p < 0.000.

The subjects were also divided into two groups based on their level of education: 
postgraduate students and undergraduate students. However, there is no difference between 
these two groups in terms of the term ratings. Therefore, further comparisons are based on 
the high- and low-level groups. 
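
The grouping step can be illustrated with a small Python sketch. The paper does not say exactly how the cut-off was chosen; the equal group sizes (11 and 11) suggest a split at the median, so that is the assumption made here. The function name and participant identifiers are hypothetical, and the four scores are simply the end points of the two rating ranges reported in Table 1.

```python
def split_by_rating(scores):
    """Split participants into low and high halves by mean term rating.

    `scores` maps a participant id to a mean rating. With an odd number
    of participants the extra one falls into the high group (an arbitrary
    choice made for this sketch).
    """
    ordered = sorted(scores.items(), key=lambda item: item[1])
    half = len(ordered) // 2
    low = [pid for pid, _ in ordered[:half]]
    high = [pid for pid, _ in ordered[half:]]
    return low, high


# Hypothetical participant ids with the range end points from Table 1.
scores = {"P1": 0.93, "P2": 3.4, "P3": 2.08, "P4": 2.13}
low_group, high_group = split_by_rating(scores)
print(low_group, high_group)
```

Splitting on the rating rather than on degree level reflects the finding that undergraduate and postgraduate students did not differ in their term ratings.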

General results on search behaviour and search effectiveness 

Search behaviour: Number of queries, average number of terms in queries, and average 
number of thesaurus terms in queries 

In this study, the user search behaviour is measured by the number of searches (queries)
for the three search questions, the average number of terms used in queries, and the average
number of Ei Thesaurus terms used in queries. As shown in Table 2, the values of the
three measures from the high-level group are all higher than those from the low-level
group. On average, the participants in the high-level group carried out about fourteen more
searches (more queries) in total than the low-level group members and they used, on
average, one more word in their queries. Despite the obvious differences on these two
measures between the two groups, no statistical significance was found by t-test at the
α = 0.05 level. In terms of the number of Ei Thesaurus terms used in the queries, the two
groups were at about the same level, with only a slightly higher number (0.34) from the
high-level group. A detailed analysis of the use of terms in queries is described later.


Subjects              Mean no. of queries per subject   Mean no. of terms per query   Mean no. of thesaurus terms per query
Low Group (n = 11)    20.09                             2.86                          2.22
High Group (n = 11)   34.64                             4.0                           2.56


Table 2: Comparison of search behaviour of the two groups 

Search Effectiveness: Total number of relevant documents and mean average precision 
(MAP) scores 

We measure search effectiveness by the total number of relevant documents identified by 
the user, and the MAP score for a user on the search results. The first measure focuses on 
the quantity of the relevant documents retrieved and the second measure focuses on the 
quality of the search: how the retrieved relevant documents were ranked in the system 
results. An effective search should have the relevant documents ranked high in the system 
returned list. The results are presented in Table 3.

Table 3: Comparison of search performance

Subjects              Mean no. of relevant documents   Mean MAP score
Low Group (n = 11)    18.64                            0.488
High Group (n = 11)   20.55                            0.59


As the data show, the subjects in the high-level group retrieved slightly more (1.91) relevant
documents for the search questions, and obtained a slightly higher (0.1) MAP score.
However, the differences between the two groups are not statistically significant when tested by
t-test at the α = 0.05 level.

Comparisons of the level of domain knowledge, search behaviour and search 
performance 

The average numbers presented above demonstrate that, in general, the high-level subjects 
tend to do more searches and to use more words in queries. However, the number of 
thesaurus terms used in queries seems to be equal for both groups, and the search 
effectiveness is at about the same level between the two groups. The details of these 
general results can be further explored. 

Figures 1 to 3 compare different measures of search behaviour and search effectiveness 
with the level of domain knowledge, represented by ratings on the thesaurus terms. These 
figures display the trends of search behaviour and search effectiveness along with the level 
of domain knowledge. In all three figures, the subjects on the horizontal axis are ordered 
based on their ratings of the thesaurus terms, which increase from left to right. In Figure 1, 
the measures from the corresponding subjects are average number of terms in a query and 
average number of thesaurus terms in a query, representing the user's search behaviour. In 
Figure 2, mean MAP scores, which represent search effectiveness, are shown against the term
ratings. The total number of relevant documents identified and the total number of queries
used by each subject are displayed in Figure 3. These two measures belong respectively to
the search performance and search behaviour measures. They are displayed in Figure 3,
rather than in the other two figures, simply for convenience, because the data scale used by
the two measures is much larger than that of the other measures.

Figure 1: Subjects' scores on the three measures

Figure 2: Comparison of term rating and MAP

Figure 3: Comparison of term rating, no. of queries and no. of relevant documents

The figures demonstrate more clearly the effects of domain knowledge on the search
behaviour measures. As we can see from Figures 1 and 3, as the ratings increase, the
average number of terms and the number of thesaurus terms used in a query increase, as
does the total number of queries in Figure 3. The figures also show where the differences
between the two groups come from: they are particularly
obvious between the subjects whose ratings are at the two ends, and they do not become
obvious until the ratings are really high. Given the small sample size, the numbers of
subjects at both ends of the rating dimension are not sufficient for a statistical
test, but the trends are apparent enough to make a judgment. The results here indicate that
within a certain range of the level of domain knowledge, there would be no effect. But
if the difference is big enough, such as between an expert and a novice in a field, the
different effects would appear.

The figures also reveal that while the amount of the domain knowledge does seem to have 
an effect on search behaviour, it does not show any effect on the search effectiveness. In 
Figures 2 and 3, despite the increase in term ratings, the MAP scores in Figure 2 and the 
total number of relevant documents identified in Figure 3 are primarily at the same level 
across all subjects. These results also indicate that merely increasing the number of 
searches or using more terms in a query does not necessarily improve the search 
effectiveness. 

Use of terms in query formulation 

Another way to investigate search behaviour is to examine the query terms the participants
used, as studied by Wildemuth (2004), Vakkari et al. (2003) and Allen (1991a), though the focus
of our study is the effects of domain knowledge on the effectiveness of searches, rather
than on search tactics or vocabulary use in searching. We compared the terms the
participants used in their queries with those used in the search questions and those in the Ei
Thesaurus. Our findings are described below. It should be noted that the queries we
describe here are not all of the queries the participants constructed and conducted: these are
only the queries that retrieved relevant documents. Those that did not yield any relevant
documents are not included.

The first search question (Q1) description was: List three specific engineering concerns
associated with automotive engine coolants. List two automotive coolant additives and
describe their function. The most frequently used search terms were automotive, engine,
coolant(s), and additive(s). Only one subject used truncation, cool*, in order to allow the
system to retrieve more words, for example, coolant, coolants, and cooling. When the
Axiom database loads up, the very first screen includes a section on Search Tips. The first
bullet indicates Use * for truncation. Only one student either paid attention to this tip or was
familiar with truncation from previous search experience.

Filler words like concerns and problem were used four times each in the query formulation
by various subjects. Even the word review was used to interrogate the system on the topic
of automotive engine coolants. It is interesting to note that the subjects did not seem to
think of synonyms when formulating their queries. The majority used the word automotive,
which came directly from the instructor's question and which the Ei Thesaurus
allows only in conjunction with engineering and fuels. Only five subjects (22.73%) used
the word automobile in their search and nobody used both automotive and automobile
connected by the Boolean operator or.
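
To make the missed opportunity concrete, the following Python sketch assembles a query that joins synonyms with OR, joins distinct concepts with AND, and uses the * truncation mentioned in the Search Tips. It is illustrative only: the upper-case operators and the parentheses are a generic Boolean convention assumed for this example, not the documented syntax of the search interface used in the study, and the function names are hypothetical.

```python
def or_group(terms):
    """Join synonyms for one concept with OR, in parentheses if needed."""
    if len(terms) == 1:
        return terms[0]
    return "(" + " OR ".join(terms) + ")"


def build_query(concepts):
    """Join concept groups with AND; each group is a list of synonyms."""
    groups = [or_group(terms) for terms in concepts if terms]
    return " AND ".join(groups)


# Concepts taken from Q1; 'coolant*' uses truncation to match
# coolant and coolants.
query = build_query([["automotive", "automobile"], ["engine"], ["coolant*"]])
print(query)  # (automotive OR automobile) AND engine AND coolant*
```

Grouping synonyms in this way broadens each concept without loosening the overall requirement that every concept be present in a retrieved record.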

Of twenty-two students, only two (9.09%) used the COMPENDEX built-in
thesaurus in order to expand their search for Q1.

The second question (Q2) description was: What are three of the most promising types of 
fuel cells? Why are they promising? What is the chemistry involved? What types are being 
considered for transportation? The most frequently used terms to formulate the query were 
fuel cell, chemistry and transportation. The term applications was used by two subjects 
(9.09%) in their query formulation in conjunction with fuel cells. The Ei Thesaurus lists the
term Applications without assigning a classification code for this term. The scope note
describes it as a very general term and recommends the use of a specific type of application
(e.g., Aerospace applications, High temperature applications, Industrial applications, etc.)
(Ei Thesaurus 1995: 35).

It seems that the subjects were not able to select the key words from the instructor's 
question and determine what is the main point of a question. Six students (27.27%) used 
the word promising in their query formulation for Q2. One student (4.54%) used most 
efficient as a synonym for promising and another one used most and promising or best and 
fuel cells as his search strategy. 

Other queries were formulated in natural language as if the searcher was engaged in a live 
conversation. One of the query formulations was how to design a fuel cell. Other queries 
included prepositions, for example, types of fuel cells used for transportation. Words like
types of or classes of were used ten times to interrogate the system. Three subjects 
(13.64%) used the expand feature to conduct the search for Q2, thus making their search 
more efficient. 

The third question (Q3) description was: What are the pollutants that automotive catalytic 
converters are designed to reduce? What are some of the thermal-fluid engineering issues
in designing automotive catalytic converters? The most frequently used words for query
formulation for Q3 were catalytic converter(s) (spelled convertors by some), pollutants,
thermal fluid, automotive, and automobile. The words automotive and automobile were 
never connected by or. 

For Q3 the term engineering was used by four subjects (18.18%) in their query formulation
in conjunction with catalytic converter(s). The Ei Thesaurus lists the term Engineering
with the 901 classification code. The scope note presents the term as very general and
recommends the use of a specific type of engineering (e.g., Automotive engineering,
Electrical engineering, High temperature engineering, etc.) (Ei Thesaurus 1995: 236). The
subjects used the term engineering with the connotation of design. Eight
subjects used the word design in conjunction with catalytic converter(s). The term design
is listed in the Ei Thesaurus, though with no classification code. Design is the term used
for design optimization and some of its related terms are Machine design, Product design,
and Structural design (Ei Thesaurus 1995: 171).

Again, filler words such as types of, issues, and considerations were used by a few subjects. 
One student truncated all of the words used in the query formulation. Another 
student simply typed all the words that came to mind (automotive catalytic converter, 
pollutants, catalytic converter design) in one search. Another interpreted Q3 as 
automobile exhaust pollutants blocked by using catalytic converters and simply 
interrogated the system with this query. The same subject then expanded the search by 
using terms accepted by the built-in thesaurus, such as catalytic converters, exhaust gases, 
automobiles, and air pollution control, thus enhancing recall. 
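
The percentages quoted in this term-use analysis are shares of the 22 participants. A minimal sketch for reproducing them is given here; the function name share and the choice of two-decimal rounding are assumptions made for illustration and are not taken from the study's analysis scripts.

```python
# Illustrative only: express a count of participants as a share of the sample.
N_PARTICIPANTS = 22  # sample size reported in the paper


def share(count: int, total: int = N_PARTICIPANTS) -> float:
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


if __name__ == "__main__":
    # Counts that appear in the term-use analysis (1, 2, 3, 4, 6 and 8 subjects).
    for count in (1, 2, 3, 4, 6, 8):
        print(f"{count} of {N_PARTICIPANTS} participants = {share(count):.2f}%")
```

With a sample this small, each participant accounts for roughly 4.5 percentage points, so differences of one or two subjects between terms should be read with caution.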

In general, regardless of whether they belonged to the high-level or the low-level 
domain knowledge group, the participants exhibited low familiarity with the Boolean operators 
used in online searching. Most of them just typed a string of words with no connectors, 
many separated the words by commas, and some used the + sign instead of 
the Boolean operator and. As already mentioned, the use of truncation was very limited, 
and so was the use of the built-in thesaurus and the expand feature of the COMPENDEX 
database. The inability to extract the key words from the instructor's questions resulted in 
the use of irrelevant words in the query formulation. The frequent use of prepositions in query 
formulations demonstrated some participants' limited searching experience. As for 
domain-specific terminology, one would expect searchers to generate synonyms while 
interacting with the information retrieval system. This particular group of students did not 
make much use of synonyms, most of them limiting their search to the words derived from 
the instructor's questions. 
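
To make the contrast concrete, the following sketch shows how a comma- or plus-separated string of words could be rewritten with explicit Boolean operators, and how synonyms and truncation could be combined for Q3. It assumes a generic Boolean syntax (AND, OR, parentheses, and * for truncation); the actual COMPENDEX query syntax may differ, and the helper names normalize_connectors, or_group and build_query are hypothetical.

```python
import re
from typing import Iterable, Sequence


def normalize_connectors(raw: str) -> str:
    """Rewrite ',' or '+' separators as an explicit AND operator."""
    parts = [p.strip() for p in re.split(r"[+,]", raw) if p.strip()]
    return " AND ".join(parts)


def or_group(synonyms: Sequence[str]) -> str:
    """Join alternative terms for one concept with OR."""
    if len(synonyms) == 1:
        return synonyms[0]
    return "(" + " OR ".join(synonyms) + ")"


def build_query(concepts: Iterable[Sequence[str]]) -> str:
    """Combine concept groups with AND; each group lists synonyms."""
    return " AND ".join(or_group(group) for group in concepts)


if __name__ == "__main__":
    # A string of words as typed by one participant, with commas as separators.
    print(normalize_connectors(
        "automotive catalytic converter, pollutants, catalytic converter design"
    ))
    # Assumed generic syntax: '*' truncates, OR links spelling variants and synonyms.
    print(build_query([
        ["catalytic converter*", "catalytic convertor*"],
        ["automotive", "automobile*"],
        ["pollutant*"],
    ]))
```

The second query links automotive and automobile with or and covers both spellings of converter, which none of the participants' queries did.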

Discussion and conclusions 

This study sought to answer three research questions: 1) Would the level of domain 
knowledge have a significant effect on the user's search behaviour? 2) Would the level of 
domain knowledge have a significant effect on the effectiveness of search? and 3) What 
would be the relationship between search behaviour and search effectiveness? The 
results of our study show that the level of domain knowledge has an effect on search 
behaviour: as the level of domain knowledge increases, the user tends to do more searches 
or queries and to use more terms in queries to search for the relevant documents, though 
the effects are not statistically significant. This result is generally consistent with findings 
from previous studies. Wildemuth (2004) found that the users' search tactics, the patterns 
of term use in queries, changed over time as their domain knowledge changed. Vakkari et 
al. (2003) found that the average number of terms in a query increased as the user's domain 
knowledge increased. Allen (1991a) found that high-knowledge users employed more 
search expressions than low-knowledge users. All these findings indicate that the first 
research question above can be answered positively. 

Our results do not support a positive answer to the second research question. Although the 
level of domain knowledge varied, the search effectiveness remained the same for all 
participants. This is related to the answer to the third research question: search 
behaviour does not seem to be related to search effectiveness. Despite the differences in 
thesaurus term ratings and in search behaviour (more knowledgeable users tended to conduct 
more search sessions for the same search question and to use more terms in a query in order 
to find relevant documents), the MAP scores and the total number of relevant documents 
identified are essentially at the same level across all participants. These results also indicate 
that merely increasing the number of searches or using more terms in a query does not 
necessarily improve search effectiveness. It might be the case that the difference in 
domain knowledge level between the two groups in this study was not large enough. As a result, 
the effective terms or words (for example, thesaurus terms) used in the queries, as 
discussed earlier, are approximately the same across the two groups. 
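
For readers unfamiliar with the MAP measure referred to above, the following is a minimal sketch of mean average precision under its standard definition. The study's exact computation (for example, cutoff rank or the set of relevance judgments) may differ, and the relevance lists in the example are invented for illustration, not taken from the study's data.

```python
from typing import Iterable, Optional, Sequence


def average_precision(ranked_relevance: Sequence[bool],
                      total_relevant: Optional[int] = None) -> float:
    """Average of precision values at each rank where a relevant document occurs.

    ranked_relevance[i] is True if the document at rank i + 1 is relevant.
    If total_relevant is not given, the number of relevant documents
    found in the ranking is used as the denominator.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank
    denominator = total_relevant if total_relevant is not None else hits
    return precision_sum / denominator if denominator else 0.0


def mean_average_precision(runs: Iterable[Sequence[bool]]) -> float:
    """Mean of the average precision over several queries."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(run) for run in runs) / len(runs)


if __name__ == "__main__":
    # Hypothetical relevance judgments for two queries (not study data).
    query_a = [True, False, True, False]   # relevant documents at ranks 1 and 3
    query_b = [False, True, False, False]  # relevant document at rank 2
    print(round(mean_average_precision([query_a, query_b]), 4))
```

Because average precision rewards relevant documents that appear early in a ranking, issuing more queries or adding more terms raises the score only if the added terms pull relevant documents towards the top.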

Performing more searches and trying more words in queries for the same search question 
reflects one way of making more effort in seeking relevant information. This extra effort, 
however, did not seem to result in more effective searching. This may be because there is a 
difference between domain knowledge and search knowledge, as identified by 
previous researchers (Yee 1993). The level of domain knowledge itself may not be 
important unless the user has a certain amount of searching expertise. Bhavnani (2002) 
found that only domain-specific search knowledge is important for effective searching. The 
use of terms in queries in our study revealed that our participants, as a whole, had limited 
familiarity with database searching in general and low familiarity with COMPENDEX, one 
of the major databases in their field, in particular. Because of this limited searching expertise, the 
extra efforts made by more knowledgeable users were not converted into more effective 
search queries. One of the results supports this: there is no difference between the 
high-level knowledge group and the low-level group in terms of the average number of thesaurus 
terms in queries, although the high-level group tended to use more terms in a query. 
Therefore, the search effectiveness, as measured by the MAP scores and the number of 
identified relevant documents, remained the same for all users. 

The results of this study suggest some future directions for information retrieval system design. 
How can information retrieval systems support more knowledgeable or experienced users 
more effectively during their information search process? More specifically, how can 
information retrieval systems lead these users to more effective searches? From the point of 
view of information retrieval system design, the results may imply a limitation of 
current systems in supporting and rewarding the more knowledgeable users. Assuming the 
system could identify a user's level of domain knowledge, how can the system be 
designed to adapt to this particular user, so that the user's extra effort leads to more 
effective searches? Could the number of queries for the same search question be used as an 
indicator of the searcher's level of domain knowledge? Research is needed to answer these 
questions. 

The small sample size and the specific engineering domain context limit the generalizability 
of the results of this study. Because of the small sample size, the findings reported in this paper 
should be considered exploratory and preliminary, and the results could also be interpreted in 
other ways. More research with larger samples is needed to validate or invalidate these 
findings. The findings also apply to a specific engineering domain 
context, and studies in other fields may produce different results. Further research should include 
more domain contexts and compare the results from each domain. The interactions between 
domain knowledge and search knowledge should also be explored. 

APPENDIX 1 

User questionnaire and term rating 

Responses to this questionnaire are used solely for research purposes. All information 
gathered will be handled and used confidentially. 

Personal Information 

Subject ID:_ 

Age:_ Gender: o Male o Female 

You are currently enrolled in: 

School/College:_ 

Faculty/Dept.:_ 

Major: 

Level of education you are currently working towards: 
o Bachelors (undergraduate) o Masters o Doctorate o Other 

Credit hours completed in current program: 


How frequently do you search/use databases? 

o Almost every day o Once a week o Once every two weeks o Once a month 
o Once every three months o Less than once every three months 
o Never use o Other 


When you search for information on a computer system, including Internet search 
engines, which of the following ways do you normally use to conduct your searches? 

o Type in search queries o Browsing o Both o Other 


If you use search queries, do you normally use advanced features that are available on 
many systems? 

o Yes o No 


Term Rating 

Please use the scale below to rate your familiarity with the following terms. 

Familiarity with TERM 

0 None 
1 Have heard of term (before taking this test) 
2 Somewhat familiar 
3 Quite familiar 
4 Very familiar 


1. Activity (thermodynamics) □ 0 □ 1 □ 2 □ 3 □ 4 
2. Adiabatic engines □ 0 □ 1 □ 2 □ 3 □ 4 
3. Aerodynamics □ 0 □ 1 □ 2 □ 3 □ 4 
4. Air conditioning □ 0 □ 1 □ 2 □ 3 □ 4 
5. Ammonia □ 0 □ 1 □ 2 □ 3 □ 4 
6. Atmospheric radiation □ 0 □ 1 □ 2 □ 3 □ 4 
7. Atmospheric thermodynamics □ 0 □ 1 □ 2 □ 3 □ 4 
8. Bearings-Low temperature □ 0 □ 1 □ 2 □ 3 □ 4 
9. Biomedical engineering □ 0 □ 1 □ 2 □ 3 □ 4 
10. Biomedical engineering-Cryotherapy □ 0 □ 1 □ 2 □ 3 □ 4 
11. Blowers □ 0 □ 1 □ 2 □ 3 □ 4 
12. Boiling liquids □ 0 □ 1 □ 2 □ 3 □ 4 
13. Bolometers □ 0 □ 1 □ 2 □ 3 □ 4 
14. Boundary layers □ 0 □ 1 □ 2 □ 3 □ 4 
15. Brayton cycle □ 0 □ 1 □ 2 □ 3 □ 4 
16. Calorific value □ 0 □ 1 □ 2 □ 3 □ 4 
17. Capillary tubes □ 0 □ 1 □ 2 □ 3 □ 4 
18. Carbon dioxide □ 0 □ 1 □ 2 □ 3 □ 4 
19. Cargo handling □ 0 □ 1 □ 2 □ 3 □ 4 
20. Carnot cycle □ 0 □ 1 □ 2 □ 3 □ 4 


APPENDIX 2 

Post-search evaluation form 

Responses to this questionnaire are used solely for research purposes. All information gathered will be 
handled and used confidentially. 


Subject ID:_ 

Date: Time: 


1. Before this search session, were you aware of the existence of thesauri used by 
information retrieval systems that can help your search? 

Yes_ No_ 

a. If yes, had you used any thesaurus in your searches before? 

Yes_ No_ 

b. If yes, was the use of the thesaurus helpful for you to find other terms? 

No opinion 0_ Not at all 1_ 2_ 3_ 4_ 5_ Extremely helpful 

For Search Question #1 

2. What information did you use to make relevance judgments on the retrieved documents 
in this study? (check as many as possible) 

_Title _Journal 

_Author _Publication year 

_Abstract _Key words 

_Full text _Other (please specify):_ 

3. Were you generally satisfied with the information retrieved from the system for your 
search question? Please circle a number below: 

No opinion 0_ Not at all 1_ 2_ 3_ 4_ 5_ Extremely helpful 

If not satisfied, please briefly explain why:_ 


4. During your search, you changed your search terms for your queries. How did you come 
up with the new term(s) when you refined your queries? 

Note: New term(s) are those that are different from the term(s) in the initial query. 


New term #1:_ 

The way you came up with it (check as many as applicable): 

(1) _from my memory; 
(2) _from the thesaurus; 
(3) _from the relevant documents retrieved by the previous query; 
(4) _Other (please explain): 


New term #2:_ 

The way you came up with it (check as many as applicable): 

(1) _from my memory; 
(2) _from the thesaurus; 
(3) _from the relevant documents retrieved by the previous query; 
(4) _Other (please explain): 


New term #3:_ 

The way you came up with it (check as many as applicable): 

(1) _from my memory; 
(2) _from the thesaurus; 
(3) _from the relevant documents retrieved by the previous query; 
(4) _Other (please explain): 


New term #4:_ 

The way you came up with it (check as many as applicable): 

(1) _from my memory; 
(2) _from the thesaurus; 
(3) _from the relevant documents retrieved by the previous query; 
(4) _Other (please explain): 


http://www.informationr.net/ir/10-2/paper217.html[ 11/12/2015 5:01:21 PM] 

Domain knowledge, search behaviour, and search effectiveness of engineering and science students: an exploratory study 


Repeated for Search Question #2, etc. 